Browsing NTNU Open by author "Arizaleta, Mikel"
Structured data extraction: separating content from noise on news websites
Arizaleta, Mikel (Master thesis, 2009)
This thesis addresses the problem of separating content from noise on news websites. We approach the problem using TiMBL, a memory-based learning software package. We have studied the relevance of the similarity ...